•  Snort

-  Signature-based  alerts 

-  Pre-processor  alerts 

•  Origin 

-  Multiple  networks  of  varying  size 

•  Volume 

-  ~30-50  million  alerts  per  month 

•  Ancillary  Information 

-  Country  code 

-  Netblock 


©  2004  by  Carnegie  Mellon  University 




Carnegie Mellon Software Engineering Institute

IDS  Data:  challenges 


CERT Situational Awareness


•  No  new  attacks 

-  Only  matches  known  signatures 

•  Lack  of  context 

-  Don’t  know  what  we  are  not  seeing 

•  Non-standardized  signature  rule  sets 

-  No  administrative  control 

•  Missing  Data 

-  Uncertainty:  Sensor  failure  vs.  no  intrusion  attempts 
TCP Destination Port Changes


Comparison  of  port  activity  across  organizations  shows  monthly 

Share of new daily source IP addresses stays fairly consistent.
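
One way to compute the share of new daily source IP addresses is sketched below. It assumes "new" means "not seen on any earlier day in the data set", which the slides do not state explicitly; the function name `new_source_share` and the sample addresses are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple


def new_source_share(daily_sources: Dict[str, Iterable[str]]) -> List[Tuple[str, float]]:
    """Return (day, share of that day's source IPs not seen on any earlier day)."""
    seen = set()
    shares = []
    for day in sorted(daily_sources):          # ISO dates sort chronologically
        todays = set(daily_sources[day])
        new = todays - seen
        share = len(new) / len(todays) if todays else 0.0
        shares.append((day, share))
        seen |= todays
    return shares


# Made-up example data (documentation-range addresses, not real alerts).
example = {
    "2004-03-01": ["192.0.2.1", "192.0.2.2"],
    "2004-03-02": ["192.0.2.2", "198.51.100.7"],
    "2004-03-03": ["198.51.100.7", "203.0.113.9", "203.0.113.10", "192.0.2.1"],
}
for day, share in new_source_share(example):
    print(day, f"{share:.2f}")
```

The first day is trivially all new, so a comparison across days is only meaningful after a warm-up period.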

Transition probabilities highlight sequential patterns in data.


•  Current State

-  Source IP records alert on Destination IP

•  Transition probability

-  Percent chance for next class of alert recorded

•  Most source/dest combos involve only one signature class

•  Small transition probabilities for
Daily Transition Probabilities


Transition  probabilities  can  be  monitored  over  time  to 
identify  consistent  sequences. 
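
A minimal sketch of how daily transition probabilities could be estimated follows. It assumes each alert is a tuple of (day, timestamp, source IP, destination IP, alert class), and that a transition is counted between consecutive alerts for the same source/destination pair on the same day. The record layout, the class names, and the function name are assumptions for illustration, not the presenters' implementation.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

Alert = Tuple[str, int, str, str, str]  # (day, timestamp, src_ip, dst_ip, alert_class)


def daily_transition_probabilities(
    alerts: Iterable[Alert],
) -> Dict[str, Dict[Tuple[str, str], float]]:
    # Collect the time-ordered alert classes for each (day, source, destination).
    sequences = defaultdict(list)
    for day, _ts, src, dst, cls in sorted(alerts, key=lambda a: (a[0], a[1])):
        sequences[(day, src, dst)].append(cls)

    # Count class-to-class transitions per day.
    counts = defaultdict(Counter)
    for (day, _src, _dst), classes in sequences.items():
        for current, nxt in zip(classes, classes[1:]):
            counts[day][(current, nxt)] += 1

    # Normalize by the number of transitions leaving each current class.
    result = {}
    for day, pair_counts in counts.items():
        row_totals = Counter()
        for (current, _nxt), n in pair_counts.items():
            row_totals[current] += n
        result[day] = {pair: n / row_totals[pair[0]] for pair, n in pair_counts.items()}
    return result


# Made-up alerts; "scan" and "exploit" are placeholder class names.
alerts = [
    ("2004-03-01", 1, "192.0.2.1", "198.51.100.5", "scan"),
    ("2004-03-01", 2, "192.0.2.1", "198.51.100.5", "scan"),
    ("2004-03-01", 3, "192.0.2.1", "198.51.100.5", "exploit"),
    ("2004-03-01", 4, "192.0.2.9", "198.51.100.5", "scan"),
    ("2004-03-01", 5, "192.0.2.9", "198.51.100.5", "exploit"),
]
probs = daily_transition_probabilities(alerts)
print(probs["2004-03-01"])
```

Plotting one entry, such as the probability of ("scan", "exploit"), across days gives the kind of time series the slide describes. Days on which no pair produced two or more alerts do not appear in the result.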
•  Current month vs. previous month

-  Across  organizations 

-  %  changes 
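
The month-over-month comparison can be sketched as a percent change per key, where the key might be an organization, a destination port, or a signature. The dictionary layout, the organization labels, and the choice to report `None` when the previous month has no alerts are assumptions made for this example.

```python
from typing import Dict, Optional


def percent_change(previous: Dict[str, int], current: Dict[str, int]) -> Dict[str, Optional[float]]:
    """Percent change in alert counts from the previous month to the current month."""
    changes = {}
    for key in sorted(set(previous) | set(current)):
        before = previous.get(key, 0)
        after = current.get(key, 0)
        # Percent change is undefined when the previous month had no alerts.
        changes[key] = None if before == 0 else 100.0 * (after - before) / before
    return changes


# Made-up monthly alert counts per organization (hypothetical labels).
previous_month = {"org_a": 1000, "org_b": 400}
current_month = {"org_a": 1500, "org_b": 300, "org_c": 50}
print(percent_change(previous_month, current_month))
```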


•  Time  Series 

-  Fit  trend  line 

-  Arbitrary  time  period 

-  Seasonal  Components 

-  Regression with ARMA errors
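
The first two time-series steps (trend line and seasonal component) can be sketched with the standard library alone. This is a simplified illustration, not a regression with ARMA errors: it fits an ordinary least-squares line, averages the residuals by position in an assumed 7-day period, and reports the lag-1 autocorrelation of what is left as a rough indication of whether an ARMA error model would be worthwhile. The daily counts are synthetic.

```python
from typing import List, Tuple


def fit_trend(values: List[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of values[t] = intercept + slope * t."""
    n = len(values)
    t_mean = (n - 1) / 2
    y_mean = sum(values) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope


def seasonal_means(residuals: List[float], period: int) -> List[float]:
    """Average residual at each position within the period (e.g. day of week)."""
    buckets = [[] for _ in range(period)]
    for t, r in enumerate(residuals):
        buckets[t % period].append(r)
    return [sum(b) / len(b) for b in buckets]


def lag1_autocorrelation(series: List[float]) -> float:
    """Correlation between consecutive values; near zero suggests little serial structure."""
    mean = sum(series) / len(series)
    num = sum((a - mean) * (b - mean) for a, b in zip(series, series[1:]))
    den = sum((x - mean) ** 2 for x in series)
    return num / den if den else 0.0


# Synthetic daily alert counts: linear growth plus a bump on two days of each week.
daily_counts = [1000 + 10 * t + (200 if t % 7 in (5, 6) else 0) for t in range(14)]
intercept, slope = fit_trend(daily_counts)
residuals = [y - (intercept + slope * t) for t, y in enumerate(daily_counts)]
season = seasonal_means(residuals, period=7)
remainder = [r - season[t % 7] for t, r in enumerate(residuals)]
print(f"slope per day: {slope:.1f}")
print("lag-1 autocorrelation of remainder:", round(lag1_autocorrelation(remainder), 3))
```

If the remainder shows strong autocorrelation, the trend and seasonal coefficients would be better estimated jointly with an ARMA error model in a statistics package.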
•  Goal: Identify data points which deviate from overall pattern of data

•  Our current implementation (Figure of Merit)

-  Evaluate hours

-  Record # alerts, # source IP addresses, # destination IP addresses, # signatures

•  For each hour, we want a measure of how deviant it was.
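
The four per-hour quantities listed above could be collected as follows. The alert record layout (timestamp, source IP, destination IP, signature) and the placeholder signature names are assumptions for this sketch.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Tuple

Alert = Tuple[datetime, str, str, str]  # (timestamp, src_ip, dst_ip, signature)


def hourly_features(alerts: Iterable[Alert]) -> Dict[datetime, Tuple[int, int, int, int]]:
    """Map each hour to (# alerts, # source IPs, # destination IPs, # signatures)."""
    counts = defaultdict(int)
    sources, dests, sigs = defaultdict(set), defaultdict(set), defaultdict(set)
    for ts, src, dst, sig in alerts:
        hour = ts.replace(minute=0, second=0, microsecond=0)  # truncate to the hour
        counts[hour] += 1
        sources[hour].add(src)
        dests[hour].add(dst)
        sigs[hour].add(sig)
    return {
        h: (counts[h], len(sources[h]), len(dests[h]), len(sigs[h]))
        for h in sorted(counts)
    }


# Made-up alerts with placeholder signature names.
sample = [
    (datetime(2004, 3, 1, 10, 5), "192.0.2.1", "198.51.100.5", "sig-A"),
    (datetime(2004, 3, 1, 10, 40), "192.0.2.1", "198.51.100.6", "sig-A"),
    (datetime(2004, 3, 1, 10, 59), "192.0.2.2", "198.51.100.5", "sig-B"),
    (datetime(2004, 3, 1, 11, 0), "192.0.2.2", "198.51.100.5", "sig-B"),
]
for hour, features in hourly_features(sample).items():
    print(hour, features)
```

Hours with no alerts at all are absent from the result, which matters given the sensor-failure ambiguity noted earlier: a missing hour is not the same as an hour with zero counts.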
•  Compute distance metric between each hour and the average hour

•  When measuring Euclidean (Mahalanobis) Distance, all points along circle (ellipse) are same distance from the center

-  Points on larger circle/ellipse are greater distance from center

•  Shape of the ellipse

-  Function of correlation between variables

•  Generalizes to n dimensions (Ellipsoid)


Mahalanobis  Distance  (based  on  ellipse) 


Euclidean Distance (based on circle)
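
The difference between the two distances can be sketched in plain Python. The Mahalanobis distance of a point x from the mean m with covariance matrix S is sqrt((x - m)^T S^-1 (x - m)); with an identity covariance it reduces to the Euclidean distance. The code below works for any number of variables, but the example uses only two made-up columns (alert count and distinct source IPs) to keep the covariance matrix well conditioned. All function names are illustrative.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def mean_vector(rows: Sequence[Vector]) -> List[float]:
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def covariance_matrix(rows: Sequence[Vector], mean: Vector) -> List[List[float]]:
    """Sample covariance (divides by n - 1)."""
    n, k = len(rows), len(mean)
    return [[sum((r[i] - mean[i]) * (r[j] - mean[j]) for r in rows) / (n - 1)
             for j in range(k)] for i in range(k)]


def solve(matrix: List[List[float]], rhs: Vector) -> List[float]:
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    k = len(rhs)
    aug = [list(matrix[i]) + [rhs[i]] for i in range(k)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, k + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * k
    for i in reversed(range(k)):
        x[i] = (aug[i][k] - sum(aug[i][j] * x[j] for j in range(i + 1, k))) / aug[i][i]
    return x


def euclidean(point: Vector, mean: Vector) -> float:
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(point, mean)))


def mahalanobis(point: Vector, mean: Vector, cov: List[List[float]]) -> float:
    diff = [p - m for p, m in zip(point, mean)]
    weighted = solve(cov, diff)  # equals cov^-1 @ diff without forming the inverse
    return math.sqrt(sum(d * w for d, w in zip(diff, weighted)))


# Made-up hourly (alert count, distinct source IPs) pairs; the last hour is unusual.
hours = [(100, 10), (120, 12), (90, 9), (110, 10), (105, 30)]
center = mean_vector(hours)
cov = covariance_matrix(hours, center)
for h in hours:
    print(h, round(euclidean(h, center), 2), round(mahalanobis(h, center, cov), 2))
```

With positively correlated variables, a point that moves against the correlation (one variable high, the other low) receives a larger Mahalanobis distance than a point at the same Euclidean distance that moves along it, which is the ellipse-versus-circle picture on the slide.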
•  Incorporate flow data

•  Automating  trend  detection 

-  Time  series  analysis 

•  Clustering 

-  Group  sources  by  similar  activity  patterns 

-  Temporal  correlation 

-  Targeting  similarities 

-  Signature  usage 

-  Look  for  evidence  of  possible  coordination 
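
As one possible starting point for grouping sources by signature usage, the sketch below joins sources whose signature sets have a Jaccard similarity at or above a threshold (single-link grouping). The slides do not name a clustering method, so the similarity measure, the 0.5 threshold, and all names here are assumptions chosen for illustration.

```python
from typing import Dict, List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Size of the intersection divided by size of the union."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_sources(signatures_by_source: Dict[str, Set[str]], threshold: float = 0.5) -> List[List[str]]:
    """Single-link grouping: sources are joined when Jaccard similarity >= threshold."""
    sources = sorted(signatures_by_source)
    parent = {s: s for s in sources}

    def find(s: str) -> str:
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    for i, a in enumerate(sources):
        for b in sources[i + 1:]:
            if jaccard(signatures_by_source[a], signatures_by_source[b]) >= threshold:
                parent[find(a)] = find(b)

    groups: Dict[str, List[str]] = {}
    for s in sources:
        groups.setdefault(find(s), []).append(s)
    return sorted(groups.values())


# Made-up signature usage per source IP (placeholder signature names).
usage = {
    "192.0.2.1": {"sig-A", "sig-B"},
    "192.0.2.2": {"sig-A", "sig-B", "sig-C"},
    "203.0.113.9": {"sig-D"},
}
print(group_sources(usage))
```

The same grouping could be applied to sets of targeted destinations or to active hours to cover the targeting and temporal-correlation bullets; sources that fall into the same group under several of these views would be candidates for a closer look at possible coordination.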